ABSTRACT


Adaptive  multichannel  prediction  filtering  has  been  completed 
on  four  data  samples,  and  adaptive  maximum-likelihood  signal  extraction 
has  been  done  on  one  sample. 

Comparison of adaptive results with those obtained from
processing the same data with stationary filters (nonchanging filters designed
from correlation-function estimates) shows that the adaptive filters approach
the stationary filters as $k_s$ (the rate-of-convergence parameter in the adaptive
algorithm) approaches 0. For larger values of $k_s$, adaptive prediction-error
filtering does better than stationary filters on nontime-stationary data, but
stationary filters are better on data samples which appear to be time-uniform.

The  performance  of  an  adaptively  designed  maximum-likelihood 
filter  was  shown  to  be  essentially  equivalent  to  that  of  a  maximum-likelihood 
filter  which  was  conventionally  designed  from  correlation-function  estimates. 
SECTION I

INTRODUCTION  AND  SUMMARY 

This report presents initial results in a study of the adaptive
filtering of seismic array data. There is a brief discussion of the theoretical
basis of the adaptive algorithm and its application to multichannel
prediction and maximum-likelihood filtering. Adaptive multichannel prediction
filtering has been completed on four data samples, and maximum-likelihood
signal extraction has been done on one sample. Adaptive filter
results are compared with those obtained from stationary filters, i.e.,
from nonchanging filters designed from correlation-function estimates.

Plots of both mean-square-error vs $k_s$ (the rate-of-convergence
parameter in the adaptive algorithm) and of mean-square-error vs time
indicate that, in the limit as $k_s$ approaches 0, the adaptive filters approach
the stationary Wiener filters. For larger values of $k_s$, the mean-square-error
of the adaptive prediction is found to be greater than the Wiener
mean-square-error for some data samples and less for other samples. The data
characteristic which defines the exact behavior of the mean-square-error-vs-$k_s$
curve appears to be related to the time-stationarity of the data.

The performance of an adaptively designed maximum-likelihood
filter was shown to be essentially equivalent to that of a maximum-likelihood
filter which was conventionally designed from correlation-function
estimates.
SECTION II

THEORY  OF  ADAPTIVE  FILTERING 

To derive the Widrow adaptive-filter algorithm without
becoming too involved with notation, the simple problem of single-channel
prediction will be used to illustrate the main features of the algorithm.
Later, the algorithm will be expanded to the multichannel case, and its
application to maximum-likelihood signal extraction will be discussed.

A. SINGLE-CHANNEL PREDICTION

Consider a single channel of sampled data points, $x_i$, and let
the problem be to take p consecutive values of $x_i$ and use them to predict
the value of the next point. To do this, these values of $x_i$ are considered
to be components of a p-dimensional column vector,

$$\mathbf{X}_n = (x_{n-p+1},\; x_{n-p+2},\; x_{n-p+3},\; \ldots,\; x_n)^T \tag{2-1}$$

To predict the value of the next point, $x_{n+1}$, the scalar
product is formed from the data vector $\mathbf{X}_n$ with the prediction-filter vector,

$$\mathbf{F} = (f_1,\; f_2,\; f_3,\; \ldots,\; f_p)^T \tag{2-2}$$


The error in the prediction of $x_{n+1}$ is

$$e_{n+1} = x_{n+1} - \mathbf{F}^T \mathbf{X}_n \tag{2-3}$$


and the squared error is

$$e_{n+1}^2 = \mathbf{F}^T \mathbf{X}_n \mathbf{X}_n^T \mathbf{F} - 2\,\mathbf{F}^T \mathbf{X}_n x_{n+1} + x_{n+1}^2 \tag{2-4}$$


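The expected value of $e_{n+1}^2$ is given by

$$\overline{e_{n+1}^2} = \mathbf{F}^T\, \overline{\mathbf{X}_n \mathbf{X}_n^T}\, \mathbf{F} - 2\,\mathbf{F}^T\, \overline{\mathbf{X}_n x_{n+1}} + \overline{x_{n+1}^2} \tag{2-5}$$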


Equation (2-5) shows the expected value of $e_{n+1}^2$ to be representable as a
p-dimensional quadratic surface in $\mathbf{F}$. The value of $\mathbf{F}$ at which the minimum
of the expected $e_{n+1}^2$ surface occurs is the optimum filter in the least-squares
sense. Adaptive processing starts with some arbitrary filter vector $\mathbf{F}$ and
iteratively converges toward the optimum $\mathbf{F}$. In this report, the (n+1)th
iteration of $\mathbf{F}$, $\mathbf{F}_{n+1}$, is found from $\mathbf{F}_n$ by the method of steepest descent, which
can be summarized in the following two rules:


1) Move opposite the direction of
the gradient of the $\overline{e_{n+1}^2}$ surface

2) The distance moved in this direction
is proportional to the magnitude of the
gradient, and the constant of proportionality
is called $k_s$


Cast into equation form, these two rules yield the steepest-descent
algorithm

$$\mathbf{F}_{n+1} = \mathbf{F}_n - k_s\, \nabla \overline{e_{n+1}^2} \tag{2-6}$$


In practice, the gradient of the expected value of $e_{n+1}^2$ is not
known. However, the Widrow adaptive-filter algorithm meets this problem
by making the approximation

$$\nabla \overline{e_{n+1}^2} \approx \nabla e_{n+1}^2$$

$\nabla e_{n+1}^2$ is obtained by differentiating Equation (2-4) with respect
to $\mathbf{F}^T$, giving

$$\nabla e_{n+1}^2 = 2\,\mathbf{X}_n \mathbf{X}_n^T \mathbf{F}_n - 2\,\mathbf{X}_n x_{n+1} = -2\, e_{n+1} \mathbf{X}_n \tag{2-7}$$


Combining Equations (2-7) and (2-6) gives the Widrow single-channel
adaptive algorithm of

$$\mathbf{F}_{n+1} = \mathbf{F}_n + 2 k_s\, e_{n+1} \mathbf{X}_n \tag{2-8}$$
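The single-channel recursion can be sketched in a few lines of Python. This is only an illustration of Equations (2-3) and (2-8), not the program used for this report; the function name `lms_predict`, the zero starting filter, the sinusoidal test signal, and the chosen values of `p`, `k_s`, and `passes` are all assumptions made for the example.

```python
import math

def lms_predict(x, p, k_s, passes=1):
    """Single-channel adaptive prediction following Equations (2-3) and (2-8)."""
    f = [0.0] * p                       # arbitrary starting filter (zeros assumed here)
    errors = []
    for _ in range(passes):
        errors = []                     # keep only the errors of the latest pass
        for n in range(p - 1, len(x) - 1):
            window = x[n - p + 1 : n + 1]          # X_n = (x_{n-p+1}, ..., x_n)
            prediction = sum(fi * xi for fi, xi in zip(f, window))
            e = x[n + 1] - prediction              # Equation (2-3)
            # Equation (2-8): F_{n+1} = F_n + 2 k_s e_{n+1} X_n
            f = [fi + 2.0 * k_s * e * xi for fi, xi in zip(f, window)]
            errors.append(e)
    return f, errors

# Hypothetical noise-free test signal: a sinusoid obeys
# x_{n+1} = 2 cos(w) x_n - x_{n-1}, so a 2-point predictor can fit it.
x = [math.sin(0.3 * n) for n in range(2000)]
f, errors = lms_predict(x, p=2, k_s=0.05, passes=5)
mse_last = sum(e * e for e in errors[-200:]) / 200
print(f, mse_last)
```

For this noise-free signal the filter should settle near (-1, 2 cos 0.3), since nothing keeps the coefficients oscillating about the optimum.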


B.  MULTICHANNEL  PREDICTION  FILTERING 

The multichannel case is shown diagrammatically in Figure
II-1. Here, C channels of time-series data are filtered by C digital filters
to produce an output which is supposed to approximate $y_n$, the desired output.
In this diagram, the subscript n used on the filter-column vectors,
the input time-series data vectors, the desired output, and the prediction
error indicates their values at the nth time. The subscript is necessary on
the filter vector since the filter weights change with time in the adaptive algorithm.

The derivation of the multichannel algorithm follows easily
from the single-channel algorithm if a new column vector $\mathbf{X}'_n$ is made by
placing the column vectors $\mathbf{X}_n(i)$, i = 1 to C, on top of each other and likewise
forming a new column vector $\mathbf{F}'_n$ from the $\mathbf{F}_n(i)$, i = 1 to C.


In terms of the new vectors, the prediction is given by a
scalar product of $\mathbf{X}'_n$ and $\mathbf{F}'_n$. Thus the prediction error can be written as

$$e_n = y_n - \mathbf{F}'^{\,T}_n \mathbf{X}'_n \tag{2-9}$$


Using the same general procedure as used in deriving the
single-channel algorithm, the multichannel adaptive-filter algorithm is then
found to be

$$\mathbf{F}'_{n+1} = \mathbf{F}'_n + 2 k_s\, e_n \mathbf{X}'_n \tag{2-10}$$
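The stacking argument can be illustrated with a short Python sketch. It assumes the prediction error of Equation (2-9) has the form $e_n = y_n - \mathbf{F}'^{\,T}_n \mathbf{X}'_n$; the helper names `stack` and `multichannel_step` and the two-channel numbers are hypothetical and serve only to show one update of Equation (2-10).

```python
def stack(channel_vectors):
    """Place the per-channel vectors X_n(i) on top of each other to form X'_n."""
    return [value for channel in channel_vectors for value in channel]

def multichannel_step(f_prime, windows, y_n, k_s):
    """One multichannel update: error (2-9), then filter update (2-10)."""
    x_prime = stack(windows)
    e_n = y_n - sum(f * x for f, x in zip(f_prime, x_prime))
    f_next = [f + 2.0 * k_s * e_n * x for f, x in zip(f_prime, x_prime)]
    return f_next, e_n

# Hypothetical example: C = 2 channels with 2-point filters, starting from zero.
f_prime = [0.0, 0.0, 0.0, 0.0]
windows = [[1.0, 2.0], [0.5, -1.0]]
f_next, e_n = multichannel_step(f_prime, windows, y_n=3.0, k_s=0.1)
print(e_n, f_next)
```

Once the channels are stacked, the update is the single-channel rule applied to one longer vector, which is why no separate derivation is needed.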


C. MAXIMUM-LIKELIHOOD SIGNAL EXTRACTION

The transformation of maximum-likelihood processing into
problems of prediction is first considered. This transformation is desirable
so that the adaptive-prediction method previously described can be used to
design maximum-likelihood filters.

Suppose we have an N-channel problem and an L-point filter
$f_{ij}$, where i = 1, ..., N and j = 1, ..., L. We wish to minimize the output of
the filter

$$y_{t-s} = \sum_{i,j} f_{ij}\, x_{i,t-j} \tag{2-11}$$

where $x_{i,t}$ is the output of seismometer i at time t. The criterion that $y_{t-s}$
be an unbiased estimate at time t-s of the signal, which is assumed constant
across channels, leads to the constraints

$$\sum_{i=1}^{N} f_{ij} = \delta_{js} \tag{2-12}$$

where

$$\delta_{js} = 0 \quad \text{for } j \neq s$$

and

$$\delta_{js} = 1 \quad \text{for } j = s$$


The constraints may be expressed as

$$f_{1j} = \delta_{js} - \sum_{i=2}^{N} f_{ij}$$

and substituted into Equation (2-11). This gives

$$y_{t-s} = \sum_{j} \Big( \delta_{js} - \sum_{i=2}^{N} f_{ij} \Big) x_{1,t-j} + \sum_{i=2}^{N} \sum_{j} f_{ij}\, x_{i,t-j}$$

which can be simplified to the form

$$y_{t-s} = x_{1,t-s} - \sum_{i=2}^{N} \sum_{j} f_{ij} \left( x_{1,t-j} - x_{i,t-j} \right) \tag{2-13}$$


Referring to Equation (2-3), Equation (2-13) can be recognized
as a prediction-error equation. Thus, the maximum-likelihood output $y_{t-s}$
can be considered the error in predicting $x_{1,t-s}$ by filters operating on the
set of data $(x_{1,t-j} - x_{i,t-j})$, where the filters are no longer subject to
constraints.

Equations could now be written to specify the filters $f_{ij}$ in
terms of the covariances of the data $x_{i,t}$. These equations would be equivalent
to the conventional system of equations but of order (N-1)L instead of
NL. However, the purpose of this section is to determine adaptively the
maximum-likelihood filters.
Referring to the algorithm of Equation (2-8) which resulted
from Equation (2-3), an adaptive algorithm follows immediately from
Equation (2-13). The resulting maximum-likelihood adaptive algorithm
is

$$f_{ij}(t+1) = f_{ij}(t) + 2 k_s\, y_{t-s} \left( x_{1,t-j} - x_{i,t-j} \right) \tag{2-14}$$
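The reduction from a constrained problem to an unconstrained prediction problem can be checked numerically. The sketch below is illustrative only: it assumes the constraint of Equation (2-12) has the form $\sum_i f_{ij} = \delta_{js}$, uses random numbers in place of seismometer data, and all function names are hypothetical. It evaluates the output once from the reduced form (2-13) and once from (2-11) with $f_{1j}$ recovered from the constraint, then applies one step of (2-14).

```python
import random

def ml_output_reduced(x, f_rest, t, s):
    """Equation (2-13). x[i][t] is channel i at time t (index 0 is channel 1);
    f_rest[i-1][j-1] holds f_ij for i = 2..N and j = 1..L."""
    y = x[0][t - s]
    for i, f_i in enumerate(f_rest, start=1):
        for j, f_ij in enumerate(f_i, start=1):
            y -= f_ij * (x[0][t - j] - x[i][t - j])
    return y

def ml_output_full(x, f_rest, t, s):
    """Equation (2-11), with f_1j = delta_js - sum over i >= 2 of f_ij."""
    n_lag = len(f_rest[0])
    f_1 = [(1.0 if j == s else 0.0) - sum(f_i[j - 1] for f_i in f_rest)
           for j in range(1, n_lag + 1)]
    full = [f_1] + f_rest
    return sum(full[i][j - 1] * x[i][t - j]
               for i in range(len(full)) for j in range(1, n_lag + 1))

def ml_update(x, f_rest, t, s, k_s):
    """One step of Equation (2-14) on the unconstrained filters."""
    y = ml_output_reduced(x, f_rest, t, s)
    return [[f_ij + 2.0 * k_s * y * (x[0][t - j] - x[i][t - j])
             for j, f_ij in enumerate(f_i, start=1)]
            for i, f_i in enumerate(f_rest, start=1)]

# Hypothetical data: N = 3 channels of random numbers, L = 4 lags, s = 2.
rng = random.Random(0)
N, L, s, t = 3, 4, 2, 10
x = [[rng.gauss(0.0, 1.0) for _ in range(20)] for _ in range(N)]
f_rest = [[rng.uniform(-1.0, 1.0) for _ in range(L)] for _ in range(N - 1)]
y_reduced = ml_output_reduced(x, f_rest, t, s)
y_full = ml_output_full(x, f_rest, t, s)
f_next = ml_update(x, f_rest, t, s, k_s=0.01)
print(y_reduced, y_full)
```

The two outputs should agree to rounding error, which is the sense in which the filters in (2-13) are "no longer subject to constraints".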


The adaptive maximum-likelihood results in this report are
derived by using Equation (2-14). Obviously, there are other ways of combining
Equation (2-11) and the constraints of Equation (2-12) into a single
prediction-error equation. For example, one could solve for $f_{3j}$ and substitute
into Equation (2-11), thereby predicting channel 3 from traces made by
subtracting the remaining channels from channel 3. All of these different
ways of producing a prediction-error equation are equivalent in the sense that
the resulting equations specifying the filters in terms of the covariances of
the data $x_{i,t}$ define equivalent filters. The adaptive algorithms resulting
from the different prediction-error equations will be different, however.

All of these algorithms are determined by reducing the dimension
of the problem by substituting in the constraint equation and then by
finding the gradient for the reduced set of filter coefficients. The constraints
are satisfied by actually using the projection of the subset gradient on the
constraint plane. A better method is obtained by finding the gradient at a
point in time for the complete set of coefficients and projecting this gradient
on the constraint plane. This "full" gradient algorithm can be derived from
Equations (2-11) and (2-12) by adding and subtracting

$$\sum_{i,j} f_{ij}\, \bar{x}_{t-j}$$

where

$$\bar{x}_{t-j} = \frac{1}{N} \sum_{i=1}^{N} x_{i,t-j}$$


Thus,

$$y_{t-s} = \bar{x}_{t-s} - \sum_{i,j} f_{ij} \left( \bar{x}_{t-j} - x_{i,t-j} \right) \tag{2-15}$$


which is in the form of a prediction-error equation so that the corresponding
adaptive algorithm is

$$f_{ij}(t+1) = f_{ij}(t) + 2 k_s\, y_{t-s} \left( \bar{x}_{t-j} - x_{i,t-j} \right) \tag{2-16}$$

Note that the constraints are always satisfied if the iteration is started with
filters satisfying the constraints.
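The claim that the constraints stay satisfied can be illustrated numerically. The sketch below assumes that the averaged trace is the channel mean, $\bar{x}_{t-j} = (1/N)\sum_i x_{i,t-j}$, and that the constraint is $\sum_i f_{ij} = \delta_{js}$. With those assumptions the increments in Equation (2-16) sum to zero over i at every lag, so the column sums of the filter never move. The function name, the random data, and the parameter values are hypothetical.

```python
import random

def full_gradient_step(f, x, t, s, k_s):
    """One step of Equation (2-16). f[i][j-1] holds f_ij for all N channels;
    x[i][t] is channel i at time t."""
    n_chan, n_lag = len(f), len(f[0])
    # Channel mean at each lag j = 1..L (assumed definition of x-bar).
    xbar = [sum(x[i][t - j] for i in range(n_chan)) / n_chan
            for j in range(1, n_lag + 1)]
    # Difference traces (xbar_{t-j} - x_{i,t-j}).
    d = [[xbar[j - 1] - x[i][t - j] for j in range(1, n_lag + 1)]
         for i in range(n_chan)]
    # Equation (2-15): output written as a prediction error.
    y = xbar[s - 1] - sum(f[i][j] * d[i][j]
                          for i in range(n_chan) for j in range(n_lag))
    f_next = [[f[i][j] + 2.0 * k_s * y * d[i][j] for j in range(n_lag)]
              for i in range(n_chan)]
    return f_next, y

# Hypothetical run: 4 channels of random data, 5 lags, s = 3.
rng = random.Random(1)
n_chan, n_lag, s = 4, 5, 3
x = [[rng.gauss(0.0, 1.0) for _ in range(300)] for _ in range(n_chan)]
# Start from a filter that satisfies the constraints (a plain channel average at lag s).
f = [[(1.0 / n_chan if j == s else 0.0) for j in range(1, n_lag + 1)]
     for _ in range(n_chan)]
for t in range(n_lag, 300):
    f, y = full_gradient_step(f, x, t, s, k_s=0.001)
column_sums = [sum(f[i][j] for i in range(n_chan)) for j in range(n_lag)]
print(column_sums)
```

The printed sums should remain 1 at lag s and 0 elsewhere, up to rounding error, after every update.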

The  final  report  will  give  a  more  complete  description  of 
maximum-likelihood  processing  by  the  adaptive  method  of  Equation  (2-16). 


SECTION  III 

EXPERIMENTAL  RESULTS 


A.  PREDICTION  FILTERING 

Adaptive multichannel prediction filtering has been completed
on four data samples. Information about these data (UBO road noise,
UBO normal noise, the center and first ring of LASA subarray B1,
and 13 channels of array data) is given in Table III-1. These data
samples also have been processed using Wiener prediction filters.

In the filtering program, the data in each trace are scaled
by 1/(rms value of that trace) so that the variance of all data traces is 1.
Thus, results of processing on the different data samples may be compared
directly.
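A minimal sketch of this scaling follows. It assumes each trace is a plain list of samples; the function name `normalize_trace` and the sample values are hypothetical. Dividing by the rms value makes the mean square of the trace equal to 1, which equals the variance when the trace has zero mean.

```python
import math

def normalize_trace(trace):
    """Scale a trace by 1/(rms value) so that its mean square becomes 1."""
    rms = math.sqrt(sum(v * v for v in trace) / len(trace))
    return [v / rms for v in trace]

# Hypothetical four-sample trace.
scaled = normalize_trace([3.0, -4.0, 0.0, 5.0])
mean_square = sum(v * v for v in scaled) / len(scaled)
print(mean_square)
```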

Results for each data sample are presented in the form of
three figures. The first figure shows mean-square-error vs $k_s$ and the
Wiener filter mean-square-error. The second shows mean-square-error vs
time for the Wiener filter, the adaptive filter with the large $k_s$, and the
adaptive filter with the small $k_s$. It should be noted that the origin in these
figures does not correspond to zero mean-square-error. The third figure
is a plot of the channel to be predicted plus the prediction and prediction
error of the Wiener and large and small $k_s$ filters.

Power spectra of the channel being predicted and of the
Wiener and adaptive error traces have been computed for UBO road noise
and LASA subarray B1.

1.  UBO  Road  Noise 

A major highway passes within a few miles of the northwest
extent of the UBO array. The UBO road noise (Figure III-1), which is predominantly
Rayleigh energy believed to originate along this highway, does not
arrive as a plane wavefront, is time varying, and is attenuated across the array.
SUMMARY OF DATA PROCESSED


Prior  to  any  multichannel  filtering,  the  data  were  prefiltered 
with  an  antialiasing,  slightly  prewhitening  filter  and  resampled  to  a  sample 
period  of  72  msec. 

A 27-point Wiener filter with its output point at the center of
the filter had been designed previously¹ from these data to predict channel
10 using channels 1 through 9. The mean-square prediction error of the
Wiener filter, when applied to the normalized design data, was 0.147.

Two adaptive processing runs, consisting of several passes
through the data for each run, were made on this road-noise sample. At
the beginning of the first pass of each run, the filter coefficients were set
to 0; on successive passes, the coefficients initially were equal to their
values at the end of the previous pass. The first run consisted of nine passes
where $k_s$ equaled 0.002 on the first pass and was scaled by two-thirds on each
successive pass, ending with a value of 0.000117 after eight passes. For the
ninth pass, $k_s$ equaled 0.00005. In the second run, five passes were made,
with $k_s$ being equal to 0.0005 on the first pass; this was incremented by
0.0005 for each additional pass. Figure III-2 plots as a function of $k_s$ the
mean-square prediction error for each of these passes, excluding the first
two in each run which were learning passes.
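The two pass schedules described above can be written out explicitly as a check on the arithmetic. The variable names are hypothetical; the starting values, the two-thirds scaling, and the 0.0005 increment are taken from the text.

```python
# First run: 0.002 scaled by two-thirds for eight passes, then 0.00005 on the ninth.
first_run = [0.002 * (2.0 / 3.0) ** i for i in range(8)] + [0.00005]

# Second run: five passes starting at 0.0005 and incremented by 0.0005 each pass.
second_run = [0.0005 * (i + 1) for i in range(5)]

for number, k_s in enumerate(first_run, start=1):
    print("run 1, pass", number, "k_s =", round(k_s, 6))
for number, k_s in enumerate(second_run, start=1):
    print("run 2, pass", number, "k_s =", round(k_s, 6))
```

The eighth pass of the first run uses 0.002 x (2/3)^7, about 0.000117, which matches the value quoted in the text.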

The fact that the adaptive filter does better than the Wiener
filter for intermediate values of $k_s$ is attributed to the nonstationarity of
UBO road noise. Figure III-3 shows mean-square-error over 50-point
intervals as a function of location in the data sample. The Wiener and small
$k_s$ adaptive plots are similar, while the plot for the strongly adapting filter
appears to be independent of the others. This result supports the hypothesis
that UBO road noise is highly nonstationary.
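A mean-square-error-vs-time curve of this kind can be computed from an error trace as sketched below. The function name `interval_mse` is hypothetical, and the choice of non-overlapping intervals with any incomplete final interval dropped is an assumption; the text states only that 50-point intervals were used.

```python
def interval_mse(errors, width=50):
    """Mean-square-error over consecutive non-overlapping intervals.
    A trailing partial interval, if any, is dropped."""
    return [sum(e * e for e in errors[start:start + width]) / width
            for start in range(0, len(errors) - width + 1, width)]

# Hypothetical error trace whose level doubles halfway through.
errors = [1.0] * 50 + [2.0] * 50
print(interval_mse(errors))
```

Comparing such curves for the Wiener and adaptive error traces shows where in the sample each filter does well, which is the comparison Figure III-3 makes.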
Figure III-4 shows the energy of the power spectrum of channel
10 to be concentrated around 2.5 cps. The 2.5-cps peak is reduced least by
the Wiener filter and is reduced most by the $k_s$ = 0.0015 filter, with the
$k_s$ = 0.00005 filter falling between. Additional evidence of the nonstationarity
of the data is the dissimilarity between the Wiener and the $k_s$ = 0.0015 error
spectra.

Figure III-5 shows channel 10 (the channel being predicted) as
well as the prediction and prediction error for the Wiener and small and large
$k_s$ filters.

2.  UBO  Normal  Noise 

A  sample  of  UBO  data,  called  normal  noise  because  it  appears 
to  travel  across  the  array  as  unattenuated  plane  waves,  is  shown  in  Figure 
III-6.  The  UBO  normal-noise  sample  was  prefiltered,  resampled,  normalized, 
and  Wiener-filtered  with  the  same  procedures  used  for  the  UBO  road  noise. 

The  normalized  mean-square  prediction  error  of  the  Wiener  filter  was  0.28. 

Three adaptive processing runs were made, one with eight
passes and two with one pass, with the filter weights being initially set to 0
at the beginning of each run. For the eight passes, $k_s$ had the values of
0.0015 (learning), 0.0015, 0.001, 0.0005, 0.00025, 0.000125, 0.00005, and
0.002. Values of $k_s$ for the second and third runs were 0.0025 and 0.003,
respectively. Figure III-7 shows the mean-square-error from all runs
(except the first learning pass).

The mean-square-error-vs-$k_s$ curve for these data differs
from the corresponding curve for road noise. Mean-square-error increases
with increasing $k_s$ up to approximately $k_s$ = 0.001 and decreases with increasing
$k_s$ from $k_s$ = 0.001 to 0.0025. Mean-square-error increases above
$k_s$ = 0.0025 where the algorithm becomes unstable.


Figure III-5. UBO Road Noise, Wiener and Adaptive Filter Outputs (traces: channel 10, Wiener error, $k_s$ = 0.0015 error, $k_s$ = 0.00005 error)
Normal Noise, Mean-Square-Error Vs $k_s$


Increasing mean-square-error with increasing $k_s$ is the
expected result for time-stationary data, since a larger $k_s$ corresponds to
a smaller time constant. Thus, the effective length of the data used in designing
the filter is decreased, which means statistically that the misdesign
and MSE of the filter are increased.

The dip in the MSE at $k_s$ = 0.0025 in Figure III-7 is surprising.
One possible explanation for this phenomenon is that the data are time varying,
with a time constant which matches the adaptive time constant corresponding
to $k_s$ = 0.0025. This is probably not the correct reason for the dip since a
similar effect is seen in other MSE-vs-$k_s$ curves (Figures III-11 and III-16).
A more likely explanation is that this decrease in mean-square-error is a
false-gain effect caused by the narrow frequency bandwidth of the data. The
second interpretation is based on the fact that a data point in a narrowband time
series can be well predicted using the recent past of the trace. At first glance,
this observation does not appear to apply because only data from channels 1
through 9 are used to predict channel 10. However, the adaptive filter, by
means of the error term in the adaptive algorithm, is influenced by the channel
10 data values. Thus, indirectly, the adaptive-filter prediction does use
the immediate past of channel 10, with the immediate past being more emphasized
for larger values of $k_s$. This phenomenon will be discussed further in
a later report.

The plot of mean-square-error vs time in Figure III-8 for
$k_s$ = 0.0015 resembles the Wiener plot, indicating that the data are stationary.

Figure III-9 shows the channel to be predicted and the prediction
and prediction error for the Wiener and large $k_s$ and small $k_s$ cases
for UBO normal noise. An interesting point of comparison between Wiener
and adaptive filtering is the computational requirements of each method.
Figure III-8. Normal Noise, Mean-Square-Error Vs Time
On the IBM 7044, the total time to design and apply the Wiener filters to
UBO normal noise was 30 min. The procedure involved five separate runs.
One run of three adaptive passes through the data would require less than
9 min and would result in approximately the same filters.

3.  LASA  Subarray  B1 

Another data set used the center seismometer and the first
ring of LASA subarray B1. The data shown in Figure III-10 have been
antialias-filtered and resampled to a sample period of 100 msec. Wiener
filters, 20 points long, were designed to predict one point ahead on channel 1
based on channels 1 through 7. The resulting normalized mean-square-error
was 0.031.

One adaptive filtering run of eight passes was made on these data with
$k_s$ values of 0.0015 (learning), 0.0015, 0.001, 0.0005, 0.00025, 0.000175, 0.0001,
0.00005, and 0.002. The mean-square-error-vs-$k_s$ curve (Figure III-11)
resulting from the adaptive filtering of these data has the same concave-downward
shape as seen for UBO normal noise. The plot for $k_s$ = 0.001
(Figure III-12) resembles the $k_s$ = 0.00005 curve enough that the data can be
considered time-stationary, although not to the extent of the UBO normal noise.
The question of a concave-upward or concave-downward shape for the
mean-square-error-vs-$k_s$ curve apparently involves the time-stationarity of
the data.

The power spectrum of channel 1 (Figure III-13) shows no
dominant high-frequency peak of the kind seen for UBO road noise. The similarity
in the spectra of the $k_s$ = 0.001 error and the $k_s$ = 0.00005 error is further
indication of the stationarity of this data sample.

Figure III-14 shows the Wiener and adaptive filtering results
for this LASA data set.
4. Array Data

The 13 channels of array data shown in Figure III-15 have
been prewhitened and resampled. A 37-point Wiener filter, with output
point at the center, was designed for these data to predict channel 1 from
channels 2 through 13. The resulting normalized mean-square-error was
0.16.


Starting with the filter weights set to 0, one adaptive-filtering
run having six passes with $k_s$ values of 0.0005 (learning), 0.0005, 0.00025,
0.000125, 0.00005, and 0.00075 was made on these data. Figure III-16
shows intermediate values of $k_s$ resulting in errors smaller than the Wiener
mean-square-error. In Figure III-17, the dissimilarity between the Wiener
and the $k_s$ = 0.0005 plots, especially in the first part of the data, indicates
that the data are nontime-stationary. (The behavior of the UBO road noise
was the same and was also nontime-stationary.)

Figure III-18 shows the predictions, the prediction error, and
the channel to be predicted for the Wiener and large $k_s$ and small $k_s$ filters.

B.  MAXIMUM-LIKELIHOOD  FILTERING 


To compare adaptive maximum-likelihood filtering with
conventional maximum-likelihood filtering, the same basic multichannel
data used by SDL in their conventional maximum-likelihood study² were
used for our adaptive maximum-likelihood work. These data, which came
from LASA subarray C1, consisted of 19 of the possible 25 subarray channels,
the six seismometers in the inner ring being omitted. A 3250-point, 100-msec
sampling-period data segment, which included the signal arrival from an
Aleutian Islands event, was the common data. The time traces were prepared
by first filtering them with a 0.8- to 2.8-cps bandpass filter, which
was thought to be the same as in the SDL study, and then time-shifting
them to align the signal.
To form a prediction problem, traces 2 through 19 were
subtracted from channel 1, yielding 18 difference traces that were normalized
and used to predict channel 1, which was also normalized to a variance of 1.

One-sided, 21-point adaptive maximum-likelihood filters,
similar to the SDL filters, were designed by two methods. The first,
beginning with the filter weights set to 0, included three passes through the
data interval from 750 to 2250 points, starting with $k_s$ = 0.0005 for the first
pass and using decreasing values of $k_s$ (0.00025, 0.00005) for each successive
pass. At the end of the third pass, the filters were fixed and the entire data
sample was filtered with these fixed filters. The SDL conventionally designed
maximum-likelihood filter used the same 750- to 2250-point filtering interval.
The second method began with filter weights of 0, used a $k_s$ of 0.00005, and
let the filters operate on-line (i.e., adapt and filter) for one pass through
the data.

Figure III-19 shows the outputs of a phased sum, the conventional
maximum-likelihood filter, and the two types of adaptive maximum-likelihood
filters. As can be seen, the adaptively designed fixed filter is
essentially equivalent in performance to that of the SDL-designed filter. The
on-line filter, which had been adapting for 1725 points at the beginning of the
shown trace, is about 3 db poorer than the off-line filters.

It was planned to have a quantitative comparison of the SDL
filter with the adaptively designed filter. However, the frequency response
of our bandpass filter was appreciably narrower than that of the SDL bandpass
filter, enough so that measured signal-to-noise ratio improvements
have little meaning. It is planned to repeat this experiment using the SDL
bandpass filter so that our results can be compared in a more precise
manner.
Figure III-19. Maximum-Likelihood Filter Outputs


SECTION  IV 

CONCLUSIONS  AND  RECOMMENDATIONS 

From the theory of adaptive filtering, adaptive prediction
filtering results would be expected to show several things. The adaptive
filter should approach the Wiener filter as $k_s$ approaches 0. This convergence,
which was not explicitly searched for, seems to be true experimentally.

Another expected result is that the adaptive mean-square-error
may be less than the Wiener mean-square-error if the data are time-varying
but should always be greater if the data are stationary. The excess
mean-square-error for stationary data can be shown to result from random
oscillations of the filter coefficients about their optimum values.³ Smaller
adaptive mean-square-errors for nonstationary data are produced by the
ability of the filters to track the changing minimum of the quadratic
error-squared surface. These theoretical expectations seem to be verified in
general by our experimental results. The exception is the interesting phenomenon
of the dip in mean-square-error for large values of $k_s$ (just before
the algorithm becomes unstable). This MSE decrease, which is thought to
be false gain caused by the narrow frequency bandwidth of the data, is a
subject for future study. The final expected theoretical result is that, as
$k_s$ increases, a point is reached where the algorithm becomes unstable.
A study of the parameters controlling the stability of the algorithm is being
made.


The following summarizes our conclusions and
recommendations.


•  Results of this study indicate that, in the limit
as $k_s$ approaches 0, the resulting adaptive filter
approaches the Wiener filter; therefore, the
adaptive processing scheme could be of value
as an economical means of Wiener filter design


•  As data statistics change, the optimum value
of $k_s$ changes; therefore, the investigation of
methods of varying $k_s$ with changing data statistics
is recommended

•  Some data samples, when filtered adaptively,
result in concave-downward mean-square-error-vs-$k_s$
curves while other data samples result in
a concave-upward curve; preliminary results
given in this report indicate that the data characteristic
which determines the shape of this
curve is related to the time-stationarity of
the data

•  Adaptive  maximum-likelihood  filtering  results 
indicate  that  this  type  of  filtering  can  be  done 
with  much  less  time  and  expense  than  required 
by  conventional  means 

•  The  inclusion  of  methods  of  extending  the  adaptive 
filtering  concepts  to  the  problem  of  signal  extraction 
based  on  a  theoretical  signal  model  is  recommended 
for  any  future  study 


•  Only one prewhitened sample is included in the data
processed here; the other three samples will be prewhitened
and adaptively filtered by the same procedure
used on the raw data in this report in order
to determine the effects of prewhitening on adaptive
filtering, and a later report will present these results